A visual Turing test for computer vision systems
نویسنده
چکیده
Today, computer vision systems are tested by their accuracy in detecting and localizing instances of objects. As an alternative, and motivated by the ability of humans to provide far richer descriptions, even tell a story about an image, we construct a ”visual Turing test”: an operator-assisted device that produces a stochastic sequence of binary questions from a given test image. The query engine proposes a question; the operator either provides the correct answer or rejects the question as ambiguous; the engine proposes the next question (“just-in-time truthing”). The test is then administered to the computer-vision system, one question at a time. After the system’s answer is recorded, the system is provided the correct answer and the next question. Parsing is trivial and deterministic; the system being tested requires no natural language processing. The query engine employs statistical constraints, learned from a training set, to produce questions with essentially unpredictable answers— the answer to a question, given the history of questions and their correct answers, is nearly equally likely to be positive or negative. In this sense, the test is only about vision. The system is designed to produce streams of questions that follow natural story lines, from the instantiation of a unique object, through an exploration of its properties, and onto its relationships with other uniquely instantiated objects.
منابع مشابه
Can you fool AI with adversarial examples on a visual Turing test?
Adversarial attacks are known to succeed on classifiers, but it has been an open question whether more complex vision systems are vulnerable. In this paper, we study adversarial examples for vision and language models, which incorporate natural language understanding and complex structures such as attention, localization, and modular architectures. In particular, we investigate attacks on a den...
متن کاملComputer assisted instruction during quarantine and computer vision syndrome
Computer vision syndrome (CVS) is a set of visual, ocular, and musculoskeletal symptoms that result from long-term computer use. These symptoms include eyestrain, dry eyes, burning, pain, redness, blurred vision, etc, which increase with the duration of computer use. Currently, with the closure of schools and universities due to the continued COVID19 pandemic many universities have taken the pr...
متن کاملVisual Turing test for computer vision systems.
Today, computer vision systems are tested by their accuracy in detecting and localizing instances of objects. As an alternative, and motivated by the ability of humans to provide far richer descriptions and even tell a story about an image, we construct a "visual Turing test": an operator-assisted device that produces a stochastic sequence of binary questions from a given test image. The query ...
متن کاملBayesian perspective over time
Thomas Bayes, the founder of Bayesian vision, entered the University of Edinburgh in 1719 to study logic and theology. Returning in 1722, he worked with his father in a small church. He also was a mathematician and in 1740 he made a novel discovery which he never published, but his friend Richard Price found it in his notes after his death in 1761, reedited it and published it. But until L...
متن کاملA Novel Approach to Background Subtraction Using Visual Saliency Map
Generally human vision system searches for salient regions and movements in video scenes to lessen the search space and effort. Using visual saliency map for modelling gives important information for understanding in many applications. In this paper we present a simple method with low computation load using visual saliency map for background subtraction in video stream. The proposed technique i...
متن کامل